Using Twitter to Examine Smoking Behavior and Perceptions of Emerging Tobacco Products

Authors

  • Nathan Cobb
  • Sherry Emery
  • Tania Hernández
  • Mark Myslín
  • Shu-Hong Zhu
  • Wendy Chapman
  • Mike Conway

Abstract

BACKGROUND Social media platforms such as Twitter are rapidly becoming key resources for public health surveillance applications, yet little is known about Twitter users' levels of informedness and sentiment toward tobacco, especially with regard to the emerging tobacco control challenges posed by hookah and electronic cigarettes.

OBJECTIVE To develop a content and sentiment analysis of tobacco-related Twitter posts and build machine learning classifiers to detect tobacco-relevant posts and sentiment towards tobacco, with a particular focus on new and emerging products like hookah and electronic cigarettes.

METHODS We collected 7362 tobacco-related Twitter posts at 15-day intervals from December 2011 to July 2012. Each tweet was manually classified using a triaxial scheme, capturing genre, theme, and sentiment. Using the collected data, machine-learning classifiers were trained to detect tobacco-related vs irrelevant tweets as well as positive vs negative sentiment, using Naïve Bayes, k-nearest neighbors, and Support Vector Machine (SVM) algorithms. Finally, phi contingency coefficients were computed between each of the categories to discover emergent patterns.

RESULTS The most prevalent genres were first- and second-hand experience and opinion, and the most frequent themes were hookah, cessation, and pleasure. Sentiment toward tobacco was overall more positive (1939/4215, 46% of tweets) than negative (1349/4215, 32%) or neutral among tweets mentioning it, even excluding the 9% of tweets categorized as marketing. Three separate metrics converged to support an emergent distinction between, on one hand, hookah and electronic cigarettes corresponding to positive sentiment, and on the other hand, traditional tobacco products and more general references corresponding to negative sentiment. These metrics included correlations between categories in the annotation scheme (phi(hookah-positive)=0.39; phi(e-cigs-positive)=0.19); correlations between search keywords and sentiment (χ²₄=414.50, P<.001, Cramér's V=0.36); and the most discriminating unigram features for positive and negative sentiment ranked by log odds ratio in the machine learning component of the study. In the automated classification tasks, SVMs using a relatively small number of unigram features (500) achieved best performance in discriminating tobacco-related from unrelated tweets (F score=0.85).

CONCLUSIONS Novel insights available through Twitter for tobacco surveillance are attested through the high prevalence of positive sentiment. This positive sentiment is correlated in complex ways with social image, personal experience, and recently popular products such as hookah and electronic cigarettes. Several apparent perceptual disconnects between these products and their health effects suggest opportunities for tobacco control education. Finally, machine classification of tobacco-related posts shows a promising edge over strictly keyword-based approaches, yielding an improved signal-to-noise ratio in Twitter data and paving the way for automated tobacco surveillance applications.
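
The association measures named in the abstract (phi coefficient, Cramér's V, log odds ratio, F score) can be illustrated with a minimal Python sketch. It uses the textbook definitions, which may differ in detail from the authors' implementation. The function names, the smoothing constant `alpha`, and the toy 2x2 table are assumptions made for illustration and are not data from the study. Only the two sentiment counts at the end come from the abstract.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient for the 2x2 contingency table [[a, b], [c, d]]."""
    den = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / den if den else 0.0


def chi_square(table):
    """Pearson chi-square statistic and sample size for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, n


def cramers_v(table):
    """Cramer's V = sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    stat, n = chi_square(table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(stat / (n * k))


def log_odds_ratio(word_pos, n_pos, word_neg, n_neg, alpha=0.5):
    """Smoothed log odds ratio of a word appearing in positive vs negative tweets.

    alpha is an additive smoothing constant (an assumption, not from the source).
    """
    p = (word_pos + alpha) / (n_pos + 2 * alpha)
    q = (word_neg + alpha) / (n_neg + 2 * alpha)
    return math.log(p / (1 - p)) - math.log(q / (1 - q))


def f_score(precision, recall):
    """Balanced F score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy table (NOT study data): rows = theme present/absent,
# columns = positive sentiment yes/no.
toy = [[30, 10], [10, 30]]
print(phi_coefficient(30, 10, 10, 30), cramers_v(toy))

# Sentiment shares reported in the abstract.
print(round(100 * 1939 / 4215), round(100 * 1349 / 4215))
```

For a 2x2 table, Cramér's V equals the absolute value of phi, which is why the toy example prints the same value twice.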

Similar resources

Factors Related with Tobacco Smoking among College Students: The Application of the Extended Theory of Planned Behavior

Aims: Today, the prevalence of tobacco smoking among students is increasing. Therefore, the purpose of the present study was to determine the factors related with tobacco smoking among students using extended theory of planned behavior. Instrument & Methods: This cross-sectional descriptive-analytical study was conducted on 360 students of Universities of Malayer city who were selected by stra...

Perceptions of Menthol Cigarettes Among Twitter Users: Content and Sentiment Analysis

BACKGROUND Menthol cigarettes are used disproportionately by African American, female, and adolescent smokers. Twitter is also used disproportionately by minority and younger populations, providing a unique window into conversations reflecting social norms, behavioral intentions, and sentiment toward menthol cigarettes. OBJECTIVE Our purpose was to identify the content and frequency of conver...

E-Cigarette Surveillance With Social Media Data: Social Bots, Emerging Topics, and Trends

BACKGROUND As e-cigarette use rapidly increases in popularity, data from online social systems (Twitter, Instagram, Google Web Search) can be used to capture and describe the social and environmental context in which individuals use, perceive, and are marketed this tobacco product. Social media data may serve as a massive focus group where people organically discuss e-cigarettes unprimed by a r...

The Indirect Influence of Tobacco Advertising on Smoking Susceptibility: A Case of Teenagers from Hispanic Communities

In the context of Hispanic teen smoking, this study examined the influence of pro-tobacco advertising and smoking warnings on teens from Hispanic communities. Analyzing data of two strata that had the highest percentage of urban (N = 1206) and non-urban (N = 385) Hispanic teenagers aged 13-17 from the 2012 National Youth Tobacco Survey, we found that pro-tobacco advertising indirectly influence...

Judgments, awareness, and the use of snus among adults in the United States.

INTRODUCTION Alternative tobacco products, such as snus, are emerging in the U.S. market. Understanding correlates of awareness and use, particularly judgments about harm and addictiveness, can inform public health communications about these products. METHODS Data were collected from a web panel representative of the U.S. population in March 2013 (N = 2,067). The survey assessed awareness and...

Journal title:

Volume: 15  Issue:

Pages: -

Publication date: 2013